On Model Selection Consistency of Lasso

Authors

  • Peng Zhao
  • Bin Yu
Abstract

Sparsity or parsimony of statistical models is crucial for their proper interpretations, as in sciences and social sciences. Model selection is a commonly used method to find such models, but usually involves a computationally heavy combinatorial search. Lasso (Tibshirani, 1996) is now being used as a computationally feasible alternative to model selection. Therefore it is important to study Lasso for model selection purposes. In this paper, we prove that a single condition, which we call the Irrepresentable Condition, is almost necessary and sufficient for Lasso to select the true model both in the classical fixed p setting and in the large p setting as the sample size n gets large. Based on these results, sufficient conditions that are verifiable in practice are given to relate to previous works and help applications of Lasso for feature selection and sparse representation. This Irrepresentable Condition, which depends mainly on the covariance of the predictor variables, states that Lasso selects the true model consistently if and (almost) only if the predictors that are not in the true model are “irrepresentable” (in a sense to be clarified) by predictors that are in the true model. Furthermore, simulations are carried out to provide insights and understanding of this result.
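For reference, the objects in the abstract can be written compactly as follows (a restatement in the paper's standard notation, not a substitute for its definitions). With y = Xβ + ε, the Lasso estimate is

    \hat{\beta}(\lambda) = \arg\min_{\beta} \|y - X\beta\|_2^2 + \lambda \|\beta\|_1

Partition β = (β(1)ᵀ, β(2)ᵀ)ᵀ so that β(1) collects the nonzero (true-model) coefficients, and partition C_n = (1/n)XᵀX conformably into blocks C¹¹, C¹², C²¹, C²². The Strong Irrepresentable Condition then requires a constant η > 0 such that, element-wise,

    \left| C_n^{21} \left( C_n^{11} \right)^{-1} \operatorname{sign}\!\big(\beta^{(1)}\big) \right| \le \mathbf{1} - \eta

The simulations in the paper contrast designs for which this condition holds or fails. The following is a minimal, hypothetical sketch of that contrast using scikit-learn; the design x3 = (2/3)x1 + (2/3)x2 + (1/3)e, the coefficient values, and the alpha are illustrative choices, not the paper's setup:

    import numpy as np
    from sklearn.linear_model import Lasso

    rng = np.random.default_rng(0)
    n = 1000
    beta = np.array([2.0, 3.0, 0.0])  # true model: predictors 0 and 1

    def selected_support(violate_condition):
        x1, x2, e = rng.normal(size=(3, n))
        if violate_condition:
            # x3 is "representable" by x1, x2: |C21 (C11)^-1 sign(beta1)| = 4/3 > 1
            x3 = (2 / 3) * x1 + (2 / 3) * x2 + (1 / 3) * e
        else:
            x3 = rng.normal(size=n)  # independent x3: condition holds
        X = np.column_stack([x1, x2, x3])
        y = X @ beta + rng.normal(size=n)
        coef = Lasso(alpha=0.1).fit(X, y).coef_
        return np.flatnonzero(np.abs(coef) > 1e-8).tolist()

    print("condition holds    ->", selected_support(False))  # typically [0, 1]
    print("condition violated ->", selected_support(True))   # typically [0, 1, 2]

When the condition is violated, the irrelevant predictor correlates strongly enough with the residual that the Lasso's optimality conditions force it into the selected set, which is exactly the failure mode the theorem characterizes.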


Similar Articles

Estimation Consistency of the Group Lasso and its Applications

We extend the ℓ2-consistency result of Meinshausen and Yu (2008) from the Lasso to the group Lasso. Our main theorem shows that the group Lasso achieves estimation consistency under a mild condition, and an asymptotic upper bound on the number of selected variables can be obtained. As a result, we can apply the nonnegative garrote procedure to the group Lasso result to obtain an estimator which ...
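For reference, the group Lasso (Yuan and Lin, 2006) replaces the Lasso's ℓ1 penalty with a sum of group-wise ℓ2 norms over a partition of the coefficients into groups g = 1, …, G (group weights omitted here for brevity):

    \hat{\beta} = \arg\min_{\beta} \|y - X\beta\|_2^2 + \lambda \sum_{g=1}^{G} \|\beta_g\|_2

This penalty zeroes out whole groups of coefficients at once, which makes group-level selection the natural analogue of the variable selection studied above.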


The Lasso under Heteroscedasticity

Lasso is a popular method for variable selection in regression. Much theoretical understanding has been obtained recently on its model selection or sparsity recovery properties under sparse and homoscedastic linear regression models. Since these standard model assumptions are often not met in practice, it is important to understand how Lasso behaves under nonstandard model assumptions. In this ...
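In its simplest form, the nonstandard setting referred to here keeps the sparse linear model but lets the noise scale vary with the observation:

    y_i = x_i^{\top} \beta + \sigma_i \varepsilon_i, \qquad i = 1, \dots, n

where the homoscedastic case assumed by the standard theory is the special case \sigma_i \equiv \sigma.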


Autoregressive process modeling via the Lasso procedure

The Lasso is a popular model selection and estimation procedure for linear models that enjoys nice theoretical properties. In this paper, we study the Lasso estimator for fitting autoregressive time series models. We adopt a double asymptotic framework where the maximal lag may increase with the sample size. We derive theoretical results establishing various types of consistency. In particular,...
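One standard way to apply the Lasso to an autoregressive model with maximal lag p (the paper's exact formulation may differ in normalization) is the penalized conditional least-squares fit

    \hat{\phi} = \arg\min_{\phi} \sum_{t=p+1}^{n} \Big( x_t - \sum_{j=1}^{p} \phi_j x_{t-j} \Big)^2 + \lambda \sum_{j=1}^{p} |\phi_j|

with p allowed to grow with the sample size n under the double asymptotic framework mentioned above.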


Feature Selection Guided by Structural Information

In generalized linear regression problems with an abundant number of features, lasso-type regularization, which imposes an ℓ1-constraint on the regression coefficients, has become a widely established technique. Crucial deficiencies of the lasso were unmasked when Zou and Hastie (2005) introduced the elastic net. In this paper, we propose to extend the elastic net by admitting general nonnegative...
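For context, the (naive) elastic net criterion of Zou and Hastie (2005) augments the Lasso's ℓ1 penalty with a ridge term:

    \hat{\beta} = \arg\min_{\beta} \|y - X\beta\|_2^2 + \lambda_1 \|\beta\|_1 + \lambda_2 \|\beta\|_2^2

The ridge term stabilizes the fit among strongly correlated predictors, one of the Lasso deficiencies the elastic net was designed to address.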


Split LBI: An Iterative Regularization Path with Structural Sparsity

An iterative regularization path with structural sparsity is proposed in this paper based on variable splitting and the Linearized Bregman Iteration, and is hence called Split LBI. Despite its simplicity, Split LBI outperforms the popular generalized Lasso in both theory and experiments. A theory of path consistency is presented, showing that, equipped with proper early stopping, Split LBI may achieve model s...
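A sketch of the variable-splitting idea described above, with γ an auxiliary variable standing in for the structured coefficients Dβ that the generalized Lasso would penalize directly (the paper's exact parametrization may differ):

    \ell(\beta, \gamma) = \frac{1}{2n} \|y - X\beta\|_2^2 + \frac{1}{2\nu} \|D\beta - \gamma\|_2^2

The Linearized Bregman Iteration is then run on this split loss with sparsity enforced only on γ, and early stopping along the resulting iterate path plays the role of choosing the regularization strength.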




Publication date: 2006